          Media Subtype Registration for Media Type text/troff

Status of This Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This memo describes a text media subtype for tagging content that
   consists of juxtaposed text and formatting directives, as used by
   the troff series of programs, and for conveying information about
   the processing steps needed to produce formatted output.

1.  Introduction

   It is sometimes desirable to format text in a particular way for
   presentation.  One approach is to provide formatting directives in
   juxtaposition to the text to be formatted.  That approach permits
   reading the text in unformatted form (by ignoring the formatting
   directives), and it permits relatively simple repurposing of the text
   for different media by making suitable alterations to the formatting
   directives or the environment in which they operate.  One particular
   series of related programs for formatting text in accordance with
   that model is often referred to generically as "troff", although that
   is also the name of a particular lineage of programs within that
   generic category for formatting text specifically for typesetter and
   typesetter-like devices.  A related formatting program within the
   generic "troff" category, usually used for character-based output
   such as (formatted) plain text, is known as "nroff".  For the purpose
   of the media type defined here, the entire category will be referred
   to simply by the generic "troff" name.  Troff as a distinct set of
   programs first appeared in the early 1970s [N1.CSTR54], based on the
   same formatting approach used by some earlier programs ("runoff" and
   "roff").  It has been used to produce documents in various formats,
   ranging in length from short memoranda to books (including tables,
   diagrams, and other non-textual content).  It remains in wide use as
   of the date of this document; this document itself was prepared using
   the troff family of tools per [I1.RFC2223] and [I2.Lilly05].

   The basic format (juxtaposed text and formatting directives) is
   extensible and has been used for related formatting of text and
   graphical document content.  Formatting is usually controlled by a
   set of macros; a macro package is a set of related formatting tools,
   written in troff format (although compressed binary representations
   have also been used) and using basic formatting directives to extend
   and manage formatting capabilities for document authors.  There are a
   number of preprocessors that transform a textual description of some
   content into the juxtaposed text and formatting directives necessary
   to produce some desired output.  Preprocessors exist for formatting
   of tables of text and non-textual material, mathematical equations,
   chemical formulae, general line drawings, graphical representation of
   data (in plotted coordinate graphs, bar charts, etc.),
   representations of data formats, and representations of the abstract
   mathematical construct known as a graph (consisting of nodes and
   edges).  Many such preprocessors use the same general type of input
   format as the formatters, and such input is explicitly within the
   scope of the media type described in this document.

2.  Requirement Levels

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT",
   "RECOMMENDED", and "MAY" in this document are to be interpreted as
   described in [N2.BCP14].

3.  Scope of Specification

   The described media type refers to input that may be processed by
   preprocessors and by a page formatter.  It is intended to be used
   where content has some text that may be comprehensible (either as
   text per se or as a readable description of non-text content) without
   machine processing of the content.  Where there is little or no
   comprehensible text content, this media type SHOULD NOT be used.  For
   example, while output of the "pic" preprocessor certainly consists of
   troff-compatible sequences of formatting directives, the sheer number
   of individual directives interspersed with any text that might be
   present makes comprehension difficult, whereas the preprocessor input
   language (as described in the "Published Specification" section of
   the registration below) may provide a concise and comprehensible
   description of graphical content.  Preprocessor output that includes
   a large proportion of formatting directives would best be labeled as
   a subtype of the application media type.  If particular preprocessor
   input content describes only graphical content with little or no
   text, and which is not readily comprehensible from a textual
   description of the graphical elements, a subtype of the image media
   type would be appropriate.  The purpose of labeling media content is
   to provide information about that content to facilitate use of the
   content.  Use of a particular label requires some common sense and
   judgment, and SHOULD NOT be mechanically applied to content in the
   absence of such judgment.

4.  Registration Form

   The registration procedure and form are specified in [I3.RFC4288].

   Type name: text

   Subtype name: troff

   Required parameters: none

   Optional parameters:

      charset: Must be a charset registered for use with MIME text types
         [N3.RFC2046], except where transport protocols are explicitly
         exempted from that restriction.  Specifies the charset of the
         media content.  With traditional source content, this will be
         the default "US-ASCII" charset.  Some recent versions of troff
         processing software can handle Unicode input charsets; however,
         there may be interoperability issues if the input uses such a
         charset (see "Interoperability considerations" below).

      process: Human-readable additional information for formatting,
         including environment variables, preprocessor arguments and
         order, formatter arguments, and postprocessors.  The parameter
         value may need to be quoted or encoded as provided for by
         [N4.RFC2045] as amended by [N5.RFC2231] and [N6.Errata].
         Generating implementations must not encode executable content
         and other implementations must not attempt any execution or
         other interpretation of the parameter value, as the parameter
         value may be prose text.  Implementations SHOULD present the
         parameter (after reassembly of continuation parameters, etc.)
         as information related to the media type, particularly if the
         media content is not immediately available (e.g., as with
         message/external-body composite media [N3.RFC2046]).

      resources: Lists any additional files or programs that are
         required for formatting (e.g., via .cf, .nx, .pi, .so, and/or
         .sy directives).

      versions: Human-readable indication of any known specific versions
         of preprocessors, formatter, macro packages, postprocessors,
         etc., required to process the content.
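
      As a non-normative illustration, the sketch below uses Python's
      standard email package to attach the optional parameters to a
      Content-Type field and to read one back.  The function name and
      the "versions" text are hypothetical; the "process" value is
      borrowed from Appendix A.1.  The parameter value is only
      displayed, never executed.

```python
from email.message import Message


def label_troff_part(body: str, process: str, versions: str) -> Message:
    """Build a MIME part labeled text/troff with optional parameters.

    Hypothetical helper for illustration; not part of the registration.
    """
    part = Message()
    # add_header() quotes parameter values, so spaces and "|" are safe.
    part.add_header(
        "Content-Type",
        "text/troff",
        charset="US-ASCII",
        process=process,
        versions=versions,
    )
    part.set_payload(body)
    return part


part = label_troff_part(
    ".TL\nExample\n",
    process="dformat | pic -n | troff -ms",
    versions="hypothetical: any pic that accepts -n",  # invented value
)

# A recipient presents the value as information; it does not run it.
print("Suggested processing:", part.get_param("process"))
```

      Non-ASCII parameter text would additionally need the encoding
      described in [N5.RFC2231]; the sketch assumes plain ASCII values.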

   Encoding considerations:

      7bit is adequate for traditional troff provided line endings are
         canonicalized per [N3.RFC2046].  Transfer of this media type
         content via some transport mechanisms may require or benefit
         from encoding into a 7bit range via a suitable encoding method
         such as the ones described in [N4.RFC2045].  In particular,
         some lines in this media type might begin or end with
         whitespace, and that leading and/or trailing whitespace might
         be discarded or otherwise mangled if the media type is not
         encoded for transport.

      8bit may be used with Unicode characters represented as a series
         of octets using the utf-8 charset [I4.RFC3629], where transport
         methods permit 8bit content and where content line length is
         suitable.  Transport encoding considerations for robustness may
         also apply, and if a suitable 8bit encoding mechanism is
         standardized, it might be applicable for protection of media
         during transport.

      binary may be necessary when raw Unicode is used or where line
         lengths exceed the allowable maximum for 7bit and 8bit content
         [N4.RFC2045], and may be used in environments (e.g., HTTP
         [I5.RFC2616]) where Unicode characters may be transferred via a
         non-MIME charset such as UTF-16 [I6.RFC2781].

      framed encoding MAY be used, but is not required and is not
         generally useful with this media type.
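
      The choice among these encodings can be sketched roughly as
      follows.  This is a heuristic under stated assumptions, not a
      normative procedure: the function name is hypothetical, the input
      is assumed to already use CRLF line endings, the 998-octet line
      limit is taken from [N4.RFC2045], and only trailing (not leading)
      whitespace is checked.

```python
MAX_LINE_OCTETS = 998  # limit for 7bit and 8bit lines, excluding CRLF


def suggest_transfer_encoding(data: bytes) -> str:
    """Suggest a Content-Transfer-Encoding for CRLF-canonical source.

    Hypothetical helper; the ordering of the checks is an assumption.
    """
    lines = data.split(b"\r\n")
    # NUL octets, bare CR or LF, or over-long lines rule out 7bit/8bit.
    for line in lines:
        if len(line) > MAX_LINE_OCTETS:
            return "binary"
        if b"\0" in line or b"\r" in line or b"\n" in line:
            return "binary"
    # Trailing blanks may be discarded in transit unless encoded.
    if any(line.endswith((b" ", b"\t")) for line in lines):
        return "quoted-printable"
    # Octets above 127 (e.g., utf-8 text) need at least 8bit.
    if any(octet > 0x7F for octet in data):
        return "8bit"
    return "7bit"
```

      Where the transport does not permit 8bit or binary content, an
      encoding such as base64 [N4.RFC2045] would be needed instead of
      the value suggested here.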

   Restrictions on usage: none

   Security considerations: Some troff directives (.sy and .pi) can
      cause arbitrary external programs to be run.  Several troff
      directives (.so, .nx, and .cf) may read external files (and/or
      devices on systems that support device input via file system
      semantics) during processing.  Several preprocessors have similar
      features.  Some implementations have a "safe" mode that disables
      some of these features.  Formatters and preprocessors are
      programmable, and it is possible to provide input which specifies
      an infinite loop, which could result in denial of service, even in
      implementations that restrict use of directives that access
      external resources.  Users of this media type SHOULD be vigilant
      of the potential for damage that may be caused by careless
      processing of media obtained from untrusted sources.
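
      As an illustration only, the sketch below flags lines that invoke
      the directives named above so that they can be shown to a user
      before any processing.  The function and table names are
      hypothetical.  It is a simple line-oriented heuristic: it assumes
      the default control characters ("." and "'") and does not detect
      renamed requests, redefined control characters, or preprocessor
      features, so it is not a substitute for a "safe" mode.

```python
import re

# Directives named in the security considerations, with the risk class
# given there.
RISKY_DIRECTIVES = {
    "sy": "may run an external program",
    "pi": "may run an external program",
    "so": "may read an external file",
    "nx": "may read an external file",
    "cf": "may read an external file",
}

# A request line starts with a control character, optional blanks, and
# then the request name.
_REQUEST = re.compile(r"^[.'][ \t]*(sy|pi|so|nx|cf)(?=\s|$)")


def find_risky_directives(source: str):
    """Return (line number, directive, risk) for each flagged line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = _REQUEST.match(line)
        if match:
            name = match.group(1)
            findings.append((lineno, name, RISKY_DIRECTIVES[name]))
    return findings


sample = ".TH X 1\n.so /etc/passwd\nprose with .sy inside\n'  sy ls\n.sp\n"
for lineno, name, risk in find_risky_directives(sample):
    print(f"line {lineno}: .{name} {risk}")
```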

      Processing of this media type other than by facilities that strip
      or ignore potentially dangerous directives, and processing by
      preprocessors and/or postprocessors, SHOULD NOT be invoked
      automatically (i.e., without user confirmation).  In some cases,
      as this is a text media type (i.e., it contains text that is
      comprehensible without processing), it may be sufficient to
      present the media type with no processing at all.  However, like
      any other text media, this media type may contain control
      characters, and implementers SHOULD take precautions against
      untoward consequences of sending raw control characters to display
      devices.
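
      One possible precaution when presenting unprocessed content is to
      replace control characters with a visible notation before they
      reach the display.  The sketch below is an assumption about how
      that might be done, not a required method; the function name is
      hypothetical, and the caret notation and the decision to pass
      newline and tab through unchanged are choices of this example.

```python
def make_display_safe(text: str) -> str:
    """Replace raw control characters with a printable notation."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\n\t":
            out.append(ch)  # keep line structure and tabs
        elif code < 0x20:
            out.append("^" + chr(code + 0x40))  # e.g., ESC becomes ^[
        elif code == 0x7F:
            out.append("^?")  # DEL
        elif 0x80 <= code <= 0x9F:
            out.append(f"\\x{code:02x}")  # C1 control characters
        else:
            out.append(ch)
    return "".join(out)
```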

      Users of this media type SHOULD carefully scrutinize suggested
      command lines associated with the "process" parameter, contained
      in comments within the media, or conveyed via external mechanisms,
      both for attempts at social engineering and for the effects of
      ill-considered values of the parameter.  While some
      implementations may have "safe" modes, those using this media type
      MUST NOT presume that they are available or active.

      Comments may be included in troff source; comments are not
      formatted for output.  However, they are of course readable in the
      troff document source.  Authors should be careful about
      information placed in comments, as such information may result in
      a leak of information, or may have other undesirable consequences.

      While it is possible to overlay text with graphics or otherwise
      produce formatting instructions that would visually obscure text
      when formatted, such measures do not prevent extracting text from
      the document source, and might be ineffective in obscuring text
      when formatted electronically, e.g., as PostScript or PDF.

   Interoperability considerations: Recent implementations of
      formatters, macro packages, and preprocessors may include some
      extended capabilities that are not present in earlier
      implementations.  Use of such extensions obviously limits the
      ability to produce consistent formatted output at sites with
      implementations that do not support those extensions.  Use of any
      such extensions in a particular document using this media type
      SHOULD be indicated via the "versions" parameter value.

      As mentioned in the Introduction, macro packages are troff
      documents, and their content may be subject to copyright.  That
      has led to multiple independent implementations of macro packages,
      which may exhibit gross or subtle differences with some content.

      Some preprocessors or postprocessors might be unavailable at some
      sites.  Where some implementation is available, there may be
      differences in implementation that affect the output produced.
      For example, some versions of the "pic" preprocessor provide the
      capability to fill a bounded graphical object; others lack that
      capability.  Of those that support that feature, there are
      differences in whether a solid fill is represented by a value of
      0.0 vs. 1.0.  Some implementations support only gray-scale output;
      others support color.

      Preprocessors or postprocessors may depend on additional programs
      such as awk, and implementation differences (including bugs) may
      lead to different results on different systems (or even on the
      same system with a different environment).

      There is a wide variation in the capabilities of various
      presentation media and the devices used to prepare content for
      presentation.  Indeed, that is one reason that there are two basic
      formatter program types (nroff for output where limited formatting
      control is available, and troff where a greater range of control
      is possible).  Clearly, a document designed to use complex or
      sophisticated formatting might not be representable in simpler
      media or with devices lacking certain capabilities.  Often it is
      possible to produce a somewhat inferior approximation; colors
      might be represented as gray-scale values, accented characters
      might be produced by overstriking, italics might be represented by
      underlining, etc.

      Various systems store text with different line ending codings.
      For the purpose of transferring this media type between systems or
      between applications using MIME methods, line endings MUST use the
      canonical CRLF line ending per [N3.RFC2046].
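
      A minimal sketch of that canonicalization, assuming the content
      is held as a string whose line endings may be LF, CR, or CRLF
      (the function name is hypothetical):

```python
import re

# Match CRLF first so that an existing CRLF is not doubled.
_ANY_LINE_ENDING = re.compile(r"\r\n|\r|\n")


def to_canonical_crlf(text: str) -> str:
    """Rewrite every line ending as CRLF before MIME transfer."""
    return _ANY_LINE_ENDING.sub("\r\n", text)
```

      Applying the function to text that is already canonical leaves it
      unchanged, so it can be used safely more than once.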

   Published specification: [N1.CSTR54]

   Applications which use this media type: The following applications in
      each sub-category are examples.  The lists are not intended to be
      exhaustive.

      Preprocessors: tbl [I7.CSTR49], grap [I8.CSTR114], pic
         [I9.CSTR116], chem [I10.CSTR122], eqn [I11.eqn], dformat
         [I12.CSTR142]

      Formatters: troff, nroff, Eroff, sqtroff, groff, awf, cawf

      Format converters: deroff, troffcvt, unroff, troff2html, mm2html

      Macro packages: man [I13.UNIXman1], me [I14.me], mm
         [I15.DWBguide], ms [I16.ms], mv [I15.DWBguide], rfc
         [I2.Lilly05]

   Additional information:

      Magic number(s): None; however, the content format is distinctive
         (see "Published specification").

      File extension(s): Files do not require any specific "extension".
         Many are in use as a convenience for mechanized processing of
         files, some associated with specific macro packages or
         preprocessors; others are ad hoc.  File names are orthogonal to
         the nature of the content.  In particular, while a file name or
         a component of a name may be useful in some types of automated
         processing of files, the name or component might not be capable
         of indicating subtleties such as proportion of textual (as
         opposed to image or formatting directive) content.  This media
         type SHOULD NOT be assigned a relationship with any file
         "extension" where content may be untrusted unless there is
         provision for human judgment that may be used to override that
         relationship for individual files.  Where appropriate, a file
         name MAY be suggested by a suitable mechanism such as the one
         specified in [I17.RFC2183] as amended by [N5.RFC2231] and
         [N6.Errata].

      Macintosh File Type Code(s): unknown

   Person & email address to contact for further information:
      Bruce Lilly
      blilly@erols.com

   Intended usage: COMMON

   Author/Change controller: IESG

   Consistency: The media has provision for comments; these are
      sometimes used to convey recommended processing commands, to
      indicate required resources, etc.  To avoid confusing recipients,
      senders SHOULD ensure that information specified in optional
      parameters is consistent with any related information that may be
      contained within the media content.

5.  Acknowledgements

   The author would like to acknowledge the helpful comments provided by
   members of the ietf-types mailing list.

6.  Security Considerations

   Security considerations are discussed in the media registration.
   Additional considerations may apply when the media subtype is used in
   some contexts (e.g., MIME [I18.RFC2049]).

7.  Internationalization Considerations

   The optional charset parameter may be used to indicate the charset of
   the media type content.  In some cases, that content's charset might
   be carried through processing for display of text.  In other cases,
   combinations of octets in particular sequences are used to represent
   glyphs that cannot be directly represented in the content charset.
   In either of those categories, the language(s) of the text might not
   be evident from the character content, and it is RECOMMENDED that a
   suitable mechanism (e.g., [I19.RFC3282]) be used to convey text
   language where such a mechanism is available [I20.BCP18].  Where
   multiple languages are used within a single document, it may be
   necessary or desirable to indicate the languages to readers directly
   via explicit indication of language in the content.  In still other
   cases, the media type content (while readable and comprehensible in
   text form) represents symbolic or graphical information such as
   mathematical equations or chemical formulae, which are largely global
   and language independent.

8.  IANA Considerations

   IANA shall enter and maintain the registration information in the
   media type registry as directed by the IESG.

Appendix A.  Examples

A.1.  Data Format

   The input:

      Content-Type: text/troff ; process="dformat | pic -n | troff -ms"

   Here's what an IP packet header looks like:

      .begin dformat
      style fill off
      style bitwid 0.20
      style recspread 0
      style recht 0.33333
      noname
       0-3 \0Version
       4-7 IHL
       8-15 \0Type of Service
       16-31 Total Length
      noname
       0-15 Identification
       16-18 \0Flags
       19-31 Fragment Offset
      noname
       0-7 Time to Live
       8-15 Protocol
       16-31 Header Checksum
      noname
       0-31 Source Address
      noname
       0-31 Destination Address
      noname
       0-23 Options
       24-31 Padding
      .end

   produces as output:

   Here's what an IP packet header looks like:

   +-------+-------+---------------+-------------------------------+
   |Version| IHL   |Type of Service|         Total Length          |
   0------34------78-------------1516----+-----------------------31+
   |        Identification         |Flags|    Fragment Offset      |
   0---------------+-------------1516--1819----------------------31+
   | Time to Live  |   Protocol    |       Header Checksum         |
   0--------------78-------------1516----------------------------31+
   |                        Source Address                         |
   0-------------------------------------------------------------31+
   |                     Destination Address                       |
   0-----------------------------------------------+-------------31+
   |                   Options                     |   Padding     |
   0---------------------------------------------2324------------31+

A.2.  Simple Diagram

   The input:

      Content-Type: text/troff ; process="use pic -n then troff -ms"

   The SMTP design can be pictured as:

      .DS B
      .PS
      boxwid = 0.8
      # arrow approximation that looks acceptable in troff and nroff
      define myarrow X A: [ move right 0.055;\
       "<" ljust;line right ($1 - 0.1);">" rjust;\
       move right 0.045 ]\
      X
      User: box ht 0.333333 "User"
      FS: box ht 0.666667 "File" "System" with .n at User.s -0, 0.333333
      Client: box ht 1.333333 wid 1.1 "Client\-" "SMTP" \
       with .sw at FS.se +0.5, 0
      "SMTP client" rjust at Client.se -0, 0.166667
      move to User.e ; myarrow(0.5)
      move to FS.e ; myarrow(0.5)
      move to Client.e ; SMTP: myarrow(1.8)
      Server: box ht 1.333333 wid 1.1 "Server\-" "SMTP" \
       with .sw at Here.x, Client.s.y
      box invis ht 0.5 "SMTP" "Commands/Replies" with .s at SMTP.c
      box invis ht 0.25 "and Mail" with .n at SMTP.c
      "SMTP server" ljust at Server.sw -0, 0.166667
      move to Server.e.x, FS.e.y ; myarrow(0.5)
      FS2: box ht 0.666667 "File" "System" \
       with .sw at Server.se.x +0.5, FS.s.y
      .PE
      .DE

   produces as output:

   The SMTP design can be pictured as:

   +-------+    +----------+                 +----------+
   | User  |<-->+          |                 |          |
   +-------+    |          |      SMTP       |          |
                |          |Commands/Replies |          |
   +-------+    | Client-  |<--------------->+ Server-  |    +-------+
   |       |    |  SMTP    |    and Mail     |  SMTP    |    |       |
   | File  |<-->+          |                 |          |<-->+ File  |
   |System |    |          |                 |          |    |System |
   +-------+    +----------+                 +----------+    +-------+
                SMTP client                  SMTP server

Appendix B. Disclaimers

   This document has exactly one (1) author.

   In spite of the fact that the author's given name may also be the
   surname of other individuals, and the fact that the author's surname
   may also be a given name for some females, the author is, and has
   always been, male.

   The presence of "/SHE", "their", and "authors" (plural) in the
   boilerplate sections of this document is irrelevant.  The author of
   this document is not responsible for the boilerplate text.

   Comments regarding the silliness, lack of accuracy, and lack of
   precision of the boilerplate text should be directed to the IESG, not
   to the author.